Forestalling long time-outs in processes

ABSTRACT

The present invention relates to methods, devices, systems and computer program products which avoid long time-outs in processes run on two or more computers. A first computer (10) comprises an application unit (16) performing computational tasks, and a status determining unit (20) connected to the application unit. The status determining unit sends a request for status to a second computer (14), which is to perform some computational tasks for the application unit, and automatically blocks requests for processing of data to the second computer if no response is received from the second computer, where the requests for processing are caused by the application. The second computer comprises an application unit (26) performing the computational tasks and a status responding unit (22) which receives the request for status from the first computer, generates at least one response to the request, and sends the response to the first computer.

The present invention generally relates to the field of computers and computer networks and in particular to devices, methods, systems and computer program products used for forestalling long time-outs in processes run on two or more computers.

In the field of computers it is well known to have a process running on one computer, which from time to time needs to have some computations made on another computer. This type of communication is normally provided in the form of session oriented client-server communication.

In these types of systems the client calls the server with a request for data processing. One type of software providing this type of functionality is the Distributed Component Object Model (DCOM) provided by Microsoft. In DCOM a remote procedure call (RPC) is made from a client to a server using TCP/IP as protocol. Because of instability of the network these calls can hang for very long periods of time if the server or client has lost its network connection. This is due to the fact that DCOM repeats the call to the server several times before an error signal is generated. These time-outs can be as long as 6 minutes, which is unacceptable in many applications.

EP 940 750 describes a system including a client and a server, where a data area is reserved when a message is sent from a client object to a server object. If processing executed by the server object is correctly completed, the data indicating the processing result is stored in the data area. If processing executed by the server object is not correctly completed, the data indicating the status of the server object is stored in the data area. By reading the data in the data area, the client object receives the data of the processing result if the processing on the server object has been correctly completed. If the processing of the server object has not been correctly completed, the client object receives the status data. This document does not, however, describe how long time-outs can be avoided if processing cannot be completed, for instance because there is no network connection.

The present invention is directed towards solving the problem of long time-outs occurring when calls for processing are made to servers having problems giving computational results to a client. The invention is thus directed towards providing a client, which makes it possible to avoid long time-outs.

This problem is solved by a method of forestalling long time-outs in a process run on at least a first computing device in a network, which process makes calls for processing to a second computing device, and comprising the steps of: sending a request for status to a second computing device, and in case of no response on the request for status from the second computing device automatically blocking requests for processing of data to be sent to the second computing device.

This problem is also solved by a computing device for connection to other computing devices via a network comprising: an application unit performing computational tasks and making requests for processing to another computing device, a status determining unit connected to the application unit and arranged to send a request for status to the other computing device, which other computing device is to perform a computational task for the application unit, and automatically block requests for processing of data to the other computing device if no response is received from the other computing device, which requests for processing are caused by the application.

This problem is also solved by a program product comprising a computer readable medium, having thereon: computer program code means, to make a computer execute, when said program is loaded in the computer: sending of a request for status to another computer, and in case of no response on the request for status from the other computer, automatically blocking requests for processing of data to the other computer.

According to another aspect of the invention, there is provided a method and a server, which facilitate the avoiding of long time-outs in a client.

This is achieved by a method of determining status of a computing device used for receiving calls for processing from another computing device via a network, comprising the steps of: receiving a request for status from the other computing device, generating at least one response to the request, and sending the response to the other computing device.

This is also achieved by a computing device for connection to other computing devices via a network and comprising: an application unit performing computational tasks for another computational unit when being requested to do so by the other computing device and a status responding unit arranged to receive a request for status from the other computing device, generate at least one response to the request, and send the response to the other computing device.

This is also achieved by a program product comprising a computer readable medium, having thereon: computer program code means to make a computer execute, when said program is loaded in the computer: receiving a request for status from another computer, generating at least one response to the request, and sending the response to the other computer.

According to yet another aspect of the invention there is provided a computer network, which makes it possible to avoid long time-outs between a client and a server present in normal networks.

This is achieved by a system of computing devices including at least a first and a second computing device connected to each other via a network, the first computing device comprising: an application unit performing computational tasks, a status determining unit connected to the application unit and arranged to send a request for status to the second computing device, which second computing device is to perform some computational tasks for the application unit, and automatically block requests for processing of data to the second computing device if no response is received from the second computing device, which requests for processing are caused by the application, the second computing device comprising: an application unit performing computational tasks for the first computational unit and a status responding unit arranged to receive the request for status from the first computing device, generate at least one response to the request, and send the response to the first computing device.

These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.

The present invention will in the following be described with reference made to the accompanying drawings, in which

FIG. 1 shows a schematic drawing of two computer devices connected to each other via a network,

FIG. 2 shows a block schematic of the two computer devices of FIG. 1,

FIG. 3 shows a block schematic of a status determining unit in a first computing device connected to a status responding unit in a second computing device,

FIG. 4 shows a flow chart of a first part of a first method according to the invention performed in the first computing device,

FIG. 5 shows a flow chart of a second part of the first method,

FIG. 6 shows a flow chart of a third part of the first method,

FIG. 7 shows a flow chart of a first part of a second method according to the invention performed in the second computing device,

FIG. 8 shows a flow chart of a second part of the second method, and

FIG. 9 shows a computer readable medium, where program code for performing the first and/or the second method is stored.

Two computing devices or computers 10, 14 are connected to each other via a computer network 12. The network supports at least two types of communication protocols, such as TCP/IP and UDP/IP. Only two computers are shown in FIG. 1 for easier understanding of the invention. It should be understood that more computers can readily be connected to the network.

FIG. 2 shows a simplified block schematic of the two computers 10 and 14 connected to each other via the network 12. Here the network 12 is shown as a straight line. A first of the computers 10 is a so-called client, while a second of the computers 14 is a server. The first computer 10 has an application unit 16 connected to a client DCOM (Distributed Component Object Model) unit 18, which communicates with a corresponding server DCOM unit 24 in the second computer 14. The application unit 16 and the DCOM unit 18 are also connected to a first status determining unit or HPS (host polling service) client 20, which also communicates with a status responding unit or HPS server 22 in the second computer 14. The server DCOM unit 24 is connected to a server application unit 26 for performing different computing tasks for the client application. When the client DCOM unit 18 calls the second computer 14, there can be long time-outs if the second computer has no network connection. This is due to the fact that the client DCOM unit 18 will retry calling the second computer several times before an error is generated. This is not acceptable and the present invention solves this problem through the use of the status determining unit 20 in the first computer 10 with the help of the status responding unit 22 in the second computer.

FIG. 3 shows a block schematic of the status determining unit 20 of the first computer connected to the status responding unit 22 of the second computer via network 12. Here too the network 12 is depicted in the form of a straight line.

The status determining unit 20 includes a client control unit 28, which is connected to the DCOM unit (indicated with the arrow pointing upwards) and to the application unit (indicated with the arrow pointing to the left). The client control unit 28 is connected to a client send timer 30 and to a response timer 32. The client control unit 28 communicates with a server control unit 34 of the second computer via network 12. The server control unit 34 is connected to a request timer 38 and to a server send timer 36.

The functioning of the device will now be described in more detail. The application unit 16 is running some kind of application. It can for instance be an e-mail application, a network file system or a customer specific client/server application. When the application needs some processing to be done by the server, for instance to fetch newly received e-mails, it connects to the client DCOM unit 18 and requests processing of data in order to make a DCOM call to the server. DCOM calls are normally performed using the TCP/IP protocol. At the same time the application connects to the client control unit 28 of the status determining unit or HPS (host polling service) client 20 and also sends the request for processing to this control unit. When the client control unit 28 receives this request it starts performing the first part of the method depicted in FIG. 4. The method is thus started by the reception of a request for a DCOM call by the application, step 40. Once the method is started by a first DCOM call by the application, it continues monitoring any successive DCOM calls made by the application. The client control unit 28 reads the configuration for the HPS servers it has to poll, step 42, which should be an open configuration. If the result of the check for configuration is not ok, step 44, an error message is generated, step 46, the error is logged and the HPS client terminates, step 48. If the result is ok, step 44, a request for status or an HPS request is sent from the client control unit 28 to the server control unit 34, step 50. This request is sent on a special port dedicated to this type of polling and is sent as an empty request packet using UDP/IP (User Datagram Protocol/Internet Protocol) as protocol. The packet includes source and destination port numbers, an indication of packet length and a check sum as header and only a zero as data or payload, where zero indicates a request message. The client control unit 28 also starts the client send timer or HPS send timer 30, step 52, and the response timer or HPS response timer 32, step 54. Thereafter the client control unit 28 enters a state WaitforResponse, step 56.
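
The payload convention described above (a zero for a request, and, as described for the HPS server, a one for a response) can be illustrated with a minimal sketch. The names HpsMessage, encodeHps and decodeHps are hypothetical, and the one-byte payload width is an assumption, since the description only fixes the values zero and one. The UDP header itself is produced by the network stack and is not modelled here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical names; the description only fixes the payload values 0 and 1.
enum class HpsMessage : std::uint8_t { Request = 0, Response = 1 };

// Build the payload of an HPS datagram (assumed here to be a single byte).
std::vector<std::uint8_t> encodeHps(HpsMessage message) {
    return std::vector<std::uint8_t>(1, static_cast<std::uint8_t>(message));
}

// Parse a received payload. Returns false for anything that is not a
// one-byte payload holding 0 (request) or 1 (response).
bool decodeHps(const std::vector<std::uint8_t>& payload, HpsMessage& out) {
    if (payload.size() != 1) return false;
    if (payload[0] == 0) { out = HpsMessage::Request; return true; }
    if (payload[0] == 1) { out = HpsMessage::Response; return true; }
    return false;
}
```

Rejecting every other payload keeps the sketch strict; an extended status protocol carrying more information in the payload would relax this check.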

The WaitforResponse state is shown in FIG. 5. In the state WaitforResponse, step 58, the client control unit has two options based on different events. If the control unit receives an HPS response, step 60, it resets and starts the HPS response timer 32, step 62, thereafter connects to the DCOM unit 18 and enables DCOM communication by sending a signal EnaOut_Enabled, step 64, and then enters an enabled state, step 66. The format of the HPS response will be described later in relation to the HPS server. The response is received on the same special port number as the request is sent on.

The client control unit 28 continuously checks the status of the HPS send timer 30. If the send timer 30 has reached a time-out, step 68, the client control unit 28 decides to send yet another request, step 70, and resets and starts the send timer 30, step 72. The control unit then finishes, step 74, and returns to the original state, WaitforResponse, step 58.

FIG. 6 shows what happens in the state Enabled. In the state Enabled, step 76, the client control unit has three options based on different events. If it receives an HPS response, step 78, it resets and starts the HPS response timer 32, step 80, then finishes, step 82, and thereafter returns to the initial enabled state, step 76.

The client control unit 28 checks the send timer 30 and, if the client HPS send timer has reached a send timer time-out, step 84, the client control unit sends an HPS request, step 86, and thereafter resets and starts the send timer 30, step 88. When this is done, the control unit 28 finishes, step 90, and returns to the initial enabled state, step 76.

The client control unit also checks the response timer 32 and, if the HPS response timer has reached a response timer time-out, step 92, the client control unit 28 disables DCOM communication by sending an EnaOut_Disabled signal to the client DCOM unit, step 94. Thereafter the control unit 28 returns to the WaitforResponse state, step 96, which was described earlier in relation to FIG. 5. Thus DCOM calls are disabled if no HPS response has been received before the response timer time-out.
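
The two client states of FIGS. 5 and 6 can be summarised in a short sketch. This is an illustrative model under stated assumptions, not the patented implementation: the class HpsClient and its members are hypothetical names, time is passed in as plain millisecond values instead of being read from real timers, and the UDP send is replaced by a counter.

```cpp
#include <cassert>

// Hypothetical model of the client control unit's two states.
enum class ClientState { WaitForResponse, Enabled };

class HpsClient {
public:
    HpsClient(long sendIntervalMs, long responseTimeoutMs)
        : sendIntervalMs_(sendIntervalMs), responseTimeoutMs_(responseTimeoutMs) {}

    // Steps 50-56: send the first request, start both timers, then wait.
    void start(long nowMs) {
        sendRequest();
        sendDeadlineMs_ = nowMs + sendIntervalMs_;
        responseDeadlineMs_ = nowMs + responseTimeoutMs_;
        state_ = ClientState::WaitForResponse;
    }

    // Steps 60-66 and 78-82: a response restarts the response timer and
    // enables calls if they were blocked (EnaOut_Enabled).
    void onResponse(long nowMs) {
        responseDeadlineMs_ = nowMs + responseTimeoutMs_;
        state_ = ClientState::Enabled;
    }

    // Poll both timers; intended to be called periodically.
    void tick(long nowMs) {
        if (nowMs >= sendDeadlineMs_) {  // steps 68-72 and 84-88
            sendRequest();
            sendDeadlineMs_ = nowMs + sendIntervalMs_;
        }
        // Steps 92-96: the response timer only matters in the enabled state.
        if (state_ == ClientState::Enabled && nowMs >= responseDeadlineMs_) {
            state_ = ClientState::WaitForResponse;  // EnaOut_Disabled
        }
    }

    bool callsEnabled() const { return state_ == ClientState::Enabled; }
    int requestsSent() const { return requestsSent_; }

private:
    void sendRequest() { ++requestsSent_; }  // placeholder for the UDP send

    long sendIntervalMs_;
    long responseTimeoutMs_;
    long sendDeadlineMs_ = 0;
    long responseDeadlineMs_ = 0;
    int requestsSent_ = 0;
    ClientState state_ = ClientState::WaitForResponse;
};
```

With a 2000 ms send interval and a 5000 ms response time-out, a client that last heard from the server at 2100 ms keeps calls enabled until 7100 ms and blocks them from then on, while continuing to send requests every send interval.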

The working of the server control unit 34 in the second computer will now be described with reference to the flow charts of FIGS. 7 and 8. Turning first to FIG. 7, the server control unit 34 is initially in a state WaitforRequest, step 98, where it stays until it receives an HPS request from the client, step 100. After it has received this request the server control unit 34 resets and starts the HPS request timer 38, step 102, and thereafter sends an HPS response, step 104. The HPS response has exactly the same type of header as the received request. The only difference is that the payload or data is a one instead of a zero. This one indicates a response to the request. This message is also sent using UDP/IP. The request and the response are sent and received on the same special port, having the same port number as the HPS client uses. Thereafter the server control unit 34 also resets and starts the server HPS send timer 36, step 106, whereupon it enters an enabled state, step 108.

The enabled state is depicted in FIG. 8. In the enabled state, step 110, the server control unit 34 has three options based on different events. If an HPS request is received, step 112, the server control unit 34 resets and starts the HPS request timer 38, step 114, whereupon the server control unit 34 finishes, step 116, and returns to the initial enabled state, step 110. The server control unit 34 continuously checks the server send timer 36 and if a send timer time-out occurs, step 118, an HPS response is sent, step 120, and the HPS send timer is reset and started again, step 122, whereupon the control unit finishes, step 124, and returns to the initial enabled state, step 110. The control unit 34 also monitors the request timer 38 and if an HPS request timer time-out occurs, step 126, the control unit 34 stops the send timer 36, step 128, and enters state WaitforRequest, step 130, which state was described in relation to FIG. 7.
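
The server side of FIGS. 7 and 8 can be sketched in the same way. Again this is only an illustration: HpsServer and its members are hypothetical names, time is supplied by the caller in milliseconds, and sending a response is reduced to a counter. Checking the request timer before the send timer when both have expired is a choice made for this sketch; the description does not specify an order.

```cpp
#include <cassert>

// Hypothetical model of the server control unit's two states.
enum class ServerState { WaitForRequest, Enabled };

class HpsServer {
public:
    HpsServer(long sendIntervalMs, long requestTimeoutMs)
        : sendIntervalMs_(sendIntervalMs), requestTimeoutMs_(requestTimeoutMs) {}

    // Steps 100-108 (first request) and 112-116 (request while enabled).
    void onRequest(long nowMs) {
        requestDeadlineMs_ = nowMs + requestTimeoutMs_;  // restart request timer
        if (state_ == ServerState::WaitForRequest) {
            sendResponse();                              // step 104
            sendDeadlineMs_ = nowMs + sendIntervalMs_;   // step 106
            state_ = ServerState::Enabled;               // step 108
        }
    }

    // Poll both timers; intended to be called periodically.
    void tick(long nowMs) {
        if (state_ != ServerState::Enabled) return;
        if (nowMs >= requestDeadlineMs_) {               // steps 126-130
            state_ = ServerState::WaitForRequest;        // send timer is stopped
            return;
        }
        if (nowMs >= sendDeadlineMs_) {                  // steps 118-122
            sendResponse();
            sendDeadlineMs_ = nowMs + sendIntervalMs_;
        }
    }

    bool enabled() const { return state_ == ServerState::Enabled; }
    int responsesSent() const { return responsesSent_; }

private:
    void sendResponse() { ++responsesSent_; }  // placeholder for the UDP send

    long sendIntervalMs_;
    long requestTimeoutMs_;
    long sendDeadlineMs_ = 0;
    long requestDeadlineMs_ = 0;
    int responsesSent_ = 0;
    ServerState state_ = ServerState::WaitForRequest;
};
```

With a 2000 ms send timer and a 3000 ms request timer, a single request at time zero produces two responses (at 0 ms and 2000 ms) before the server falls back to WaitforRequest at 3000 ms.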

Time-outs are handled in the following way. The timer value is compared with a set time-out value in the corresponding control unit and if the set value is reached by the timer, the corresponding action is performed by the control unit in question.

The timer values should match the criteria set for detecting failures at fail-over times.

The client HPS response timer must be set at 2.5 times the server HPS send timer. This allows for missing one HPS response packet by the HPS client.

The server HPS request timer must be at least 1.5 times the client HPS send timer in order to be able to send two responses before a possible independent transmission is halted. Tables outlining preferred timer settings are given below.

Client timer          Setting
HPS response timer    5 seconds
HPS send timer        2 seconds

Server timer          Setting
HPS request timer     3 seconds
HPS send timer        2 seconds
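
The two ratios stated above can be checked mechanically when timers are configured. The following is a minimal sketch with hypothetical names (HpsTimers, timersConsistent); it uses integer milliseconds and cross-multiplication so that the factors 2.5 and 1.5 are compared without floating-point rounding.

```cpp
#include <cassert>

// Hypothetical container for the four timer settings, in milliseconds.
struct HpsTimers {
    long clientSendMs;
    long clientResponseMs;
    long serverSendMs;
    long serverRequestMs;
};

// Returns true if the settings satisfy both stated rules:
//   client response timer == 2.5 x server send timer
//   server request timer  >= 1.5 x client send timer
bool timersConsistent(const HpsTimers& t) {
    const bool responseRule = 2 * t.clientResponseMs == 5 * t.serverSendMs;
    const bool requestRule = 2 * t.serverRequestMs >= 3 * t.clientSendMs;
    return responseRule && requestRule;
}
```

The preferred settings (client 2 s send and 5 s response, server 2 s send and 3 s request) satisfy both rules: 5 = 2.5 x 2 and 3 = 1.5 x 2.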

Finally, the DCOM unit 18 works in the following way. When it receives a request for a DCOM call from the application, it checks whether it has received an EnaOut_Enabled signal from the client control unit before making the call. If it has not, it immediately returns a fail message to the application. This can for instance be realized with a bit of code like the following:

    HRESULT hr;
    if (HostEnabled) {
        hr = DCOMFunct1(par1, par2, par3);
    } else {
        hr = E_FAIL;
    }
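
The same gate can be written as a small portable sketch that compiles without Windows headers. All names here are hypothetical stand-ins: HResult, kOk and kFail take the place of HRESULT, S_OK and E_FAIL, and the remote call is passed in as a callable so that any DCOM function could be wrapped. The flag is atomic on the assumption that the status determining unit updates it from another thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Stand-ins for the Windows HRESULT type and codes (assumptions of this sketch).
using HResult = std::int32_t;
const HResult kOk = 0;
const HResult kFail = -1;  // plays the role of E_FAIL

// Run the remote call only if the host is currently enabled; otherwise
// fail immediately instead of waiting for the middleware's own retries.
template <typename Call>
HResult guardedCall(const std::atomic<bool>& hostEnabled, Call call) {
    if (!hostEnabled.load()) {
        return kFail;
    }
    return call();
}
```

A caller would wrap each remote invocation, for example `guardedCall(hostEnabled, [&] { return remoteFunction(); })`, where remoteFunction is a placeholder for the actual DCOM call.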

The present invention has been described in the context of control units and timer units. These are preferably provided in the form of one or more processors together with appropriate program memory containing software performing the method. This software can also be provided on computer readable media, like CD-ROM discs, for loading into a computer. FIG. 9 shows such a disc 132 containing this program code either for the client, the server or both. This disc can naturally also contain the above described program code for the DCOM unit.

The described invention has several advantages. By using a simplified protocol for checking the status of the server, it is possible to get fast responses, especially in case of network failures. In this way calls for processing to a server can be blocked when there is no network connection. The application returns faster and doesn't seem to ‘hang’.

As the protocol isn't a standard request/response protocol, but more like a request/response triggered protocol, the turnaround delay is reduced by half of the value that would be encountered in a standard request/response protocol (i.e. only the transmission delay, and not both the transmission and reception delays). This is especially advantageous when the network is heavily loaded.

By having an appropriately set response timer dependent on the reception of requests, the sending of responses is also kept to a reasonable level depending on the interest of the client.

Similarly, the send timers and response timer are set so that requests for status are sent at reasonably short intervals and so that blocking of DCOM calls is done quickly and safely.

The present invention has been described in the context of checking network failures. It is however possible to use the principles of the present invention to check other types of status of the server and the server application. In this case the request for status and the response to the request would contain more information in the data section or payload of the message.

The invention has furthermore been described in the context of TCP/IP and UDP/IP. It should however be realized that it can be used with any session oriented computer protocol.

The invention has been described in relation to just two computers. The invention can be used on more computers depending on how many computers different applications on the client need to request processing from.

The invention has been described in relation to DCOM. The invention can however be used for other middleware applications that induce their own retry mechanism out of the immediate control of the ‘user application’.

1. Method of forestalling long time-outs in a process run on at least a first computing device in a network, which process makes calls for processing to a second computing device, and comprising the steps of: sending a request for status to a second computing device, and in case of no response on the request for status from the second computing device automatically blocking requests for processing of data to be sent to the second computing device.
2. Method according to claim 1, including the step of generating a request for processing of data which causes the sending of the request for status.
3. Method according to claim 1, wherein the request for status comprises a request for information about a network connection of the second computing device and the response on the request for status comprises information about the network connection from the second computing device.
4. Method according to claim 1, further including the step of setting a time limit within which the response to the request for status is to be received and the step of blocking requests for processing is performed if no response is received within the time limit.
5. Method according to claim 4, wherein the second computing device has a time limit within which responses are to be sent to the first computing device and the time limit within which the response is to be received is between two and three times longer than the send time limit of the second computing device.
6. Method according to claim 1, further including the step of setting a time limit within which a request for status is to be sent and the sending of a request for status is performed when this time limit expires.
7. Method according to claim 1, wherein requests for status are sent using a simplified first protocol and requests for processing are sent using a second standard protocol.
8. Method of determining status of a computing device used for receiving calls for processing from another computing device via a network, comprising the steps of: receiving a request for status from the other computing device, generating at least one response to the request, and sending the response to the other computing device.
9. Method according to claim 8, wherein requests for status and responses to these requests are received and sent using a first simplified protocol.
10. Method according to claim 8, wherein the step of generating includes generating more than one response within a request time limit without waiting for further requests.
11. Method according to claim 10, wherein the time for responding to a request is reset each time a request for status is received.
12. Method according to claim 10, wherein the other computer has a send time limit determining when requests for status are to be sent and said request time limit is between one and two times longer than this send time limit.
13. Method according to claim 8, further including the step of setting a time limit for sending a response and sending the response when said time limit expires.
14. Computing device for connection to other computing devices via a network comprising: an application unit performing computational tasks and making requests for processing to another computing device, a status determining unit connected to the application unit and arranged to send a request for status to the other computing device, which other computing device is to perform a computational task for the application unit, and automatically block request for processing of data to the other computing device if no response is received from the other computing device, which request for processing is caused by the application unit.
15. Computing device for connection to other computing devices via a network and comprising: an application unit performing computational tasks for another application unit when being requested to do so by the other computing device, and a status responding unit arranged to receive a request for status from the other computing device, generate at least one response to the request, and send the response to the other computing device.
16. System of computing devices including at least a first and a second computing device connected to each other via a network, the first computing device comprising: an application unit performing computational tasks, a status determining unit connected to the application unit and arranged to send a request for status to the second computing device, which second computing device is to perform some computational tasks for the application unit, and automatically block request for processing of data to the second computing device if no response is received from the second computing device, which request for processing is caused by the application unit, the second computing device comprising: an application unit performing computational tasks for the first application unit, and a status responding unit arranged to receive the request for status from the first computing device, generate at least one response to the request, and send the response to the first computing device.
17. A program product comprising a computer readable medium, having thereon: computer program code means, to make a computer execute, when said program is loaded in the computer: sending of a request for status to another computer, and in case of no response on the request for status from the other computer, automatically blocking requests for processing of data to the other computer.
18. A program product comprising a computer readable medium, having thereon: computer program code means to make a computer execute, when said program is loaded in the computer: receiving a request for status from another computer, generating at least one response to the request, and sending the response to the other computer.